Collaborating Authors

Carnegie Mellon University


SOFA-FL: Self-Organizing Hierarchical Federated Learning with Adaptive Clustered Data Sharing

Ni, Yi, Wang, Xinkun, Zhang, Han

arXiv.org Artificial Intelligence

Federated Learning (FL) faces significant challenges in evolving environments, particularly regarding data heterogeneity and the rigidity of fixed network topologies. To address these issues, this paper proposes SOFA-FL (Self-Organizing Hierarchical Federated Learning with Adaptive Clustered Data Sharing), a novel framework that enables hierarchical federated systems to self-organize and adapt over time. The framework is built upon three core mechanisms: (1) Dynamic Multi-branch Agglomerative Clustering (DMAC), which constructs an initial efficient hierarchical structure; (2) Self-organizing Hierarchical Adaptive Propagation and Evolution (SHAPE), which allows the system to dynamically restructure its topology through atomic operations -- grafting, pruning, consolidation, and purification -- to adapt to changes in data distribution; and (3) Adaptive Clustered Data Sharing, which mitigates data heterogeneity by enabling controlled partial data exchange between clients and cluster nodes. By integrating these mechanisms, SOFA-FL effectively captures dynamic relationships among clients and enhances personalization capabilities without relying on predetermined cluster structures.


Leveraging Large Language Models for Identifying Knowledge Components

Wang, Canwen, Lin, Jionghao, Koedinger, Kenneth R.

arXiv.org Artificial Intelligence

Knowledge Components (KCs) are foundational to adaptive learning systems, but their manual identification by domain experts is a significant bottleneck. While Large Language Models (LLMs) offer a promising avenue for automating this process, prior research has been limited to small datasets and has been shown to produce superfluous, redundant KC labels. This study addresses these limitations by first scaling a "simulated textbook" LLM prompting strategy (using GPT-4o-mini) to a larger dataset of 646 multiple-choice questions. We found that this initial automated approach performed significantly worse than an expert-designed KC model (RMSE 0.4285 vs. 0.4206) and generated an excessive number of KCs (569 vs. 101). To address the issue of redundancy, we proposed and evaluated a novel method for merging semantically similar KC labels based on their cosine similarity. This merging strategy significantly improved the model's performance; a model using a cosine similarity threshold of 0.8 achieved the best result, reducing the KC count to 428 and improving the RMSE to 0.4259. This demonstrates that while scaled LLM generation alone is insufficient, combining it with a semantic merging technique offers a viable path toward automating and refining KC identification.
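The merging step described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the abstract does not say which embedding model or merge order was used, so the greedy first-match strategy, the `merge_labels` helper, and the toy three-dimensional embeddings and label names below are all assumptions made for the example. Only the 0.8 threshold comes from the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def merge_labels(embeddings, threshold=0.8):
    """Greedily map each label to the first earlier representative whose
    embedding has cosine similarity >= threshold; otherwise the label
    becomes a new representative. (Assumed strategy, for illustration.)"""
    representatives = []
    mapping = {}
    for label, vec in embeddings.items():
        for rep in representatives:
            if cosine_similarity(vec, embeddings[rep]) >= threshold:
                mapping[label] = rep
                break
        else:
            representatives.append(label)
            mapping[label] = label
    return mapping


# Hypothetical KC labels with made-up embeddings.
toy_embeddings = {
    "add fractions": [1.0, 0.1, 0.0],
    "adding fractions": [0.9, 0.2, 0.0],
    "solve linear equation": [0.0, 1.0, 0.3],
}
merged = merge_labels(toy_embeddings, threshold=0.8)
kc_count = len(set(merged.values()))
```

In this toy run the two near-duplicate fraction labels collapse into one KC while the unrelated label stays separate. A lower threshold merges more aggressively and reduces the KC count further.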


AI may blunt our thinking skills – here's what you can do about it

New Scientist

There is growing evidence that our reliance on generative AI tools is reducing our ability to think clearly and critically, but it doesn't have to be that way. Socrates wasn't the greatest fan of the written word. Famous for leaving no texts to posterity, the great philosopher is said to have believed that a reliance on writing destroys the memory and weakens the mind. Some 2400 years later, Socrates's fears seem misplaced - particularly in light of evidence that writing things down improves memory formation. A growing number of psychologists, neuroscientists and philosophers worry that ChatGPT and similar generative AI tools will chip away at our powers of information recall and blunt our capacity for clear reasoning. What's more, while Socrates relied on clever rhetoric to make his argument, these researchers are grounding theirs in empirical data.


Meet the Chinese Startup Using AI--and a Small Army of Workers--to Train Robots

WIRED

AgiBot is using AI-powered robots to do new manufacturing tasks. Smarter machines may transform physical labor in China. AgiBot, a humanoid robotics company based in Shanghai, has engineered a way for two-armed robots to learn manufacturing tasks through human training and real-world practice on a factory production line. The company says its system, which combines teleoperation and reinforcement learning, is being tested on a production line belonging to Longcheer Technology, a Chinese company that manufactures smartphones, VR headsets, and other electronic gadgets. AgiBot's project shows how more advanced AI is starting to change the abilities of industrial machines--an innovation that may creep into new areas of manufacturing in China and elsewhere.


CFL: On the Use of Characteristic Function Loss for Domain Alignment in Machine Learning

Almansour, Abdullah, Tonguz, Ozan

arXiv.org Artificial Intelligence

Machine Learning (ML) models are extensively used in various applications due to their significant advantages over traditional learning methods. However, the developed ML models often underperform when deployed in the real world due to the well-known distribution shift problem. This problem can lead to catastrophic outcomes when these decision-making systems have to operate in high-risk applications. Many researchers have previously studied this problem using statistical techniques (such as Kullback-Leibler divergence, the Kolmogorov-Smirnov test, and Wasserstein distance) to quantify the distribution shift. In this letter, we show that using the Characteristic Function (CF) as a frequency-domain approach is a powerful alternative for measuring distribution shift in high-dimensional space and for domain adaptation.
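To make the frequency-domain idea concrete, the sketch below compares two samples through their empirical characteristic functions, phi(t) = E[exp(i t·x)], evaluated at randomly drawn frequencies. It is a generic illustration under stated assumptions, not the loss defined in the letter: the squared-magnitude discrepancy, the Gaussian frequency sampling, and the names `empirical_cf` and `cf_discrepancy` are all choices made for this example.

```python
import cmath
import random


def empirical_cf(samples, t):
    """Empirical characteristic function: mean of exp(i * <t, x>) over samples."""
    total = 0j
    for x in samples:
        total += cmath.exp(1j * sum(ti * xi for ti, xi in zip(t, x)))
    return total / len(samples)


def cf_discrepancy(source, target, num_freqs=64, scale=1.0, seed=0):
    """Average squared gap between the two empirical CFs at random
    Gaussian frequencies (an assumed discrepancy, for illustration)."""
    rng = random.Random(seed)
    dim = len(source[0])
    total = 0.0
    for _ in range(num_freqs):
        t = [rng.gauss(0.0, scale) for _ in range(dim)]
        diff = empirical_cf(source, t) - empirical_cf(target, t)
        total += abs(diff) ** 2
    return total / num_freqs


# Synthetic 2-D data: two samples from the same Gaussian, one shifted sample.
rng = random.Random(1)
source = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
same = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
shifted = [[rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)] for _ in range(500)]

d_same = cf_discrepancy(source, same)
d_shifted = cf_discrepancy(source, shifted)
```

The discrepancy stays near zero for samples drawn from the same distribution and grows when the target is shifted. Because it is a smooth function of the samples, a quantity of this kind could in principle serve as an alignment term during training.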


Meta-Learning for Cross-Task Generalization in Protein Mutation Property Prediction

Badrinarayanan, Srivathsan, Su, Yue, Ock, Janghoon, Pham, Alan, Ahuja, Sanya, Farimani, Amir Barati

arXiv.org Artificial Intelligence

Protein mutations can have profound effects on biological function, making accurate prediction of property changes critical for drug discovery, protein engineering, and precision medicine. Current approaches rely on fine-tuning protein-specific transformers for individual datasets, but struggle with cross-dataset generalization due to heterogeneous experimental conditions and limited target domain data. We introduce two key innovations: (1) the first application of Model-Agnostic Meta-Learning (MAML) to protein mutation property prediction, and (2) a novel mutation encoding strategy using separator tokens to directly incorporate mutations into sequence context. We build upon transformer architectures, integrating them with MAML to enable rapid adaptation to new tasks through minimal gradient steps rather than learning dataset-specific patterns. Our mutation encoding addresses the critical limitation where standard transformers treat mutation positions as unknown tokens, significantly degrading performance. Evaluation across three diverse protein mutation datasets (functional fitness, thermal stability, and solubility) demonstrates significant advantages over traditional fine-tuning. In cross-task evaluation, our meta-learning approach achieves 29% better accuracy for functional fitness with 65% less training time, and 94% better accuracy for solubility with 55% faster training. The framework maintains consistent training efficiency regardless of dataset size, making it particularly valuable for industrial applications and early-stage protein design where experimental data is limited. This work establishes a systematic application of meta-learning to protein mutation analysis and introduces an effective mutation encoding strategy, offering a transformative methodology for cross-domain generalization in protein engineering.


SonicSieve: Bringing Directional Speech Extraction to Smartphones Using Acoustic Microstructures

Yuan, Kuang, Wang, Yifeng, Zhang, Xiyuxing, Shen, Chengyi, Kumar, Swarun, Chan, Justin

arXiv.org Artificial Intelligence

Imagine placing your smartphone on a table in a noisy restaurant and clearly capturing the voices of friends seated around you, or recording a lecturer's voice with clarity in a reverberant auditorium. We introduce SonicSieve, the first intelligent directional speech extraction system for smartphones using a bio-inspired acoustic microstructure. Our passive design embeds directional cues onto incoming speech without any additional electronics. It attaches to the in-line mic of low-cost wired earphones which can be attached to smartphones. We present an end-to-end neural network that processes the raw audio mixtures in real-time on mobile devices. Our results show that SonicSieve achieves a signal quality improvement of 5.0 dB when focusing on a 30° angular region. Additionally, the performance of our system based on only two microphones exceeds that of conventional 5-microphone arrays.


Enhancing Construction Site Analysis and Understanding with 3D Segmentation

Vasanthawada, Sri Ramana Saketh, Liu, Pengkun, Tang, Pingbo

arXiv.org Artificial Intelligence

Monitoring construction progress is crucial yet resource-intensive, prompting the exploration of computer-vision-based methodologies for enhanced efficiency and scalability. Traditional data acquisition methods, primarily focusing on indoor environments, falter in the complex, cluttered, and dynamically changing conditions of construction sites. This paper critically evaluates the application of two advanced 3D segmentation methods, Segment Anything Model (SAM) and Mask3D, in challenging outdoor and indoor conditions. Trained initially on indoor datasets, both models' adaptability and performance are assessed in real-world construction settings, highlighting the gap in current segmentation approaches due to the absence of benchmarks for outdoor scenarios. Through a comparative analysis, this study not only showcases the relative effectiveness of SAM and Mask3D but also addresses the critical need for tailored segmentation workflows capable of extracting actionable insights from construction site data, thereby advancing the field towards more automated and precise monitoring techniques.


Noninvasive brain tech and AI moves robotic hand with thought

FOX News

Thanks to a team at the University of California, Davis, there's a new brain-computer interface (BCI) system that's opening up real-time, natural conversation for people who can't speak. Noninvasive brain tech is transforming how people interact with robotic devices. Instead of relying on muscle movement, this technology allows a person to control a robotic hand by simply thinking about moving his fingers. A set of sensors is placed on the scalp to detect brain signals. These signals are then sent to a computer.


Hello Afrika: Speech Commands in Kinyarwanda

Igwegbe, George, Awojide, Martins, Bless, Mboh, Kadzo, Nirel

arXiv.org Artificial Intelligence

Voice or speech commands are a subset of the broader spoken word corpus of a language and are essential for non-contact control and activation of larger AI systems in everyday devices, especially for persons with disabilities. Currently, there is a dearth of speech command models for African languages. The Hello Afrika project aims to address this issue, and its first iteration focuses on the Kinyarwanda language, since Rwanda has shown interest in developing speech recognition technologies, culminating in one of the largest datasets on Mozilla Common Voice. The model was built on a custom speech command corpus made up of general directives, numbers, and a wake word. The final model was deployed on multiple devices (PC, mobile phone, and edge devices), and its performance was assessed using suitable metrics.